(54) Video user's environment

(57) The user (102) communicates through a digitizing
writing surface (26) with the audio/video control
apparatus (20). An on-screen display (32, 34) is gener- 
ated, providing the user (102) with a user environment 
in which a wide range of different tasks and functions 
can be performed. The digitizing writing surface (26) 
can be incorporated into a hand-held remote control 
unit (24) and the audio/video control apparatus (20) 
may likewise be incorporated into existing home enter- 
tainment or computer equipment. By tapping on the 
writing surface (26) a command bar (32) is presented on 
the screen, allowing the user (102) to select among var- 
ious functions. Included in these functions is an on- 
screen programming feature (158, 148), allowing the 
user to select programs for viewing or recording by entry 
of user-drawn annotations or commands via the writing 
surface (26). 
Description



Background and Summary of the Invention



> ices, although it is by no means clear that * , ,cor «,;n ho »w * * L assortment of program content and serv- 

Providing a user interface for a complex system such as this is by no means a simple task.

small remote control or digfe *S taL, ^ ° ' am " V '°° m ^ ttb ' e apoea,s less »»" a 

lions, complex systems can be t^trdled i „„ Th "" ,8h,hemeehan,smol P'<»"*n9hana < J ra „ n i„a Iuc . 

(words, symbols picJeTete! ™wS<TsZ«„J£ T fT te °' he ' °" n Instructions 

instructing the auoSeo system ^n^^T. ^"™" 0 "' **" complex conM """I""" such as 

see,* - svaitabCotaTsro^ P'^m. o, to 

instruction. The contro, \^ZZ Z Z ™7 ™ aua ' 0 ' v ' de ° COntrol ,unctions a ^ing to the user's selection or 
a television, or ^^cO^S^^^T^SS T * T 1°' C ° UP ' in9 * 3 Vktao diSp ' ay apparatus ' such as 
existing audio/vfdeo equipment" a can Te in^oroo a So ^ f 3PPara,US " be PaCka9ed S6paratel * 
a digitizing writing surface is p rov deo" for en ^ f hlnfj ^ COmp ° nents - A remote c ° n ™ apparatus having 

communicates with the i^SSS^^STl^. '"f"? °7 * ** UMr ^ r6m ° le C ° ntr0 ' 
implements TV remote cont^a S ^^?Sl!SS Alternatively, a full-featured personal digital assistant (PDA) that 

tus. Many commercial"" ^"^c* ^S^^ST* 3,50 ^ 35 ^ ramote COntr °' appara " 
The system further includes a d7oc^ so h a rlmm 7 * communi "ti°n. such as an infrared link. . 
also possible to implement the invention using multiple processors, one associated with the audio/video control, and
another associated with the remote control. The multiple processors work in concert as distributed processors to imple- 
ment the processing functions required by the invention. 

For a more complete understanding of the invention, its objects and advantages, refer to the following specification 
and to the accompanying drawings.

Brief Description of the Drawings 

Figure 1 illustrates a first embodiment of the invention in which the audio/video control apparatus is packaged as a 
set top box, suitable for use with a simple television set;

Figure 2 is another embodiment of the invention in which the audio/video control apparatus is packaged as part of 
a home entertainment system; 

Figure 3 is a close-up perspective view of an exemplary remote control unit with digitizing writing surface;

Figure 4 is a system block diagram showing the components of the invention together with examples of other com- 
ponents of audio/video equipment, illustrating how the invention is interconnected with this equipment; 

Figure 5 is a block diagram showing the hardware components of the audio/video control apparatus and remote
control apparatus;

Figure 6 is a block diagram of the presently preferred software architecture of the invention; 

Figure 7 is a diagram representing a screen snapshot, showing the command bar of the presently preferred user
interface;

Figure 8 shows the sign-in panel of the presently preferred user interface; 

Figure 9 shows an example of an ink search in the sign-in panel of the preferred user interface;

Figure 10 illustrates standard television controls available for manipulation through the user interface by selecting
the TV button on the command bar; 

Figure 11 illustrates an example of a TV channel search using approximate ink matching;

Figure 12 shows a TV program schedule as presented through the user interface; 

Figure 13 shows a similar TV program schedule that has been limited to display only certain categories by manipulation
through the user interface;

Figure 14 shows a VCR control function display produced by selecting the VCR button on the command bar; 

Figure 15 shows an example of the video game quick access interface; 

Figure 16 shows an example of the home shopping access interface; 

Figure 17 shows an example of the ink mail (I-mail) user interface;

Figure 18 is a flow diagram describing the ink data interpretation that forms part of the recognition system;

Figure 19 is an entity relationship diagram illustrating the steps that the system performs in searching for a user- 
drawn entry or annotation; 

Figure 20 is a functional diagram illustrating the basic edit distance technique used by the preferred embodiment;
and 

Figure 21 is another functional diagram illustrating how approximate matching may be performed with the edit distance
technique.
Description of the Preferred Embodiment
the system In its present,,. pre)errM ^ a " n ?,™ •«««•<*"« »* a detailed description ol

includes an audio/video control irttM^SiS^ ^^^-^ ,n Rgure the invention 

The hand-held remote control 24 includes a SfaSf writh e ^ ^ des, 9 ned ,or P'acement atop a television 22. 
instructions using a suitable pen o sMus \l ?S? Th > f 6 0n Whi ° h ,he USer enter hancM***, 

conjunction wi.h'emote contS 24 a7d wouW ^S^^SZTT 1 ^ "° be SUbsUtUted ° r usa « - 

remote control 24 communicate with ol anil "f a f J,- , 9 SUrf3Ce a " d Stylus - Tne contro1 ™« 20 and 
ment. the audio/video contro.tittc,^ * 30 ^ 

television 22. In this way. the television 22 serves as a vid^o d ^.TJ ( > ° f C ° UP " n9 to tne Video ,n P° rt ° f 

projected. In Figure 1 the video user interface has beLlZ^TZ^ 5 ^ Wh ' Ch ,he Video user interf ^e * 
user interactive pane. 34. The ocZ^^Sl^^l ^ZZ** as a command bar 32 and a 

appropriate signals) with the existing t^SSS^- ^ ^T" ^ (by inC ' USi ° n ° f 

interface will be presented below If desired th™tr«f ♦ „„ V , telev,s,on tuner - F "» details of the video user 

ing and decoding radioTeo^ 20 7 V '"^ 3 te,evision tuner ^u.e suitable for receiv- 

video signals to the Video* pT o ^22^^^'^^ ^ SUPP ' ieS NTSC 

A more complex home ^^1^^^. 9 c USS the ' nt6mal tmer section of th e television 

essentially the sa'me as d^S^SES*^ F^e S^^iS?^ ? ^ M " 

unit for inclusion in the home entertainment svstem alnnn 1th L?L 3y be conf, 9 ur ed as a rack mount 

tration purposes, the home ^r^TsX^S^ZT^ ^T* 5 °' audio/vide ° equfcment. For illus- 
round sound speakers 38 subwooferTo LTr^ 3 ' ar9e SCreen Predion television 36. sur- 

inputs to which additiona. co ^ 42 The tuner/amplifier has video and audio 

tape player 44. VCR 46. laser Sisc player ^S^S^^^^T^' *"* * 3 di9ital audio 

that might be used with the present invention Also f th ^ , V 6XamP ' eS ° f the type of equipment 

persona, computer may be connected ^Z^^^^^ ^ The 

ponent in Figure 2 for illustration purposes However it i<; nnt „ J«ce a . , ShOWn as a se P ar ate corn- 

component as illustrated here. Rather th confroTuni m av Vlnr * * P? 0 ** 8 the control unit 20 as a separate 
including the television itself ™ y bS ,ncor P° rate d '«*> any of the audio/video components 

inah^d^^ 

tro. unit. The remote oontro. i^udes -b^S^Z keypad S^Rand rem0te COn " 
as well as selected other buttons for nm^m™ ~ numer,c Kev Paa 56. VCR and laser disc mot.on control buttons 58 
shuttle wheel 60 may a'so ^^^SSS^^^ °l ^ A 'numb-operated £ 

dial may be used in place of the thumb ! operL,SgTult!e SySt6m Al ^tive.y, a jog shult.e 

a pe^ xz^z^;^^^^ r i desi9n r to receive h3nd - drawn ^ 

surface to be flipped up to reveal a'dditonaT push ^^uttons Knea^ T^d f ^ ^ the ^ 

embodiment is a passive screen that accepts per^ ^roke inouH^nJ fT^l 26 ° f the P referred 

providing visual feedback on the writing sSnKST^iTS^f 8 ^ ^ deSCfibed below ) without 

the video screen. One skilled in the art wHI Zl ^L^lt \ S r emb0d,ment ' the visual Redback appears on 
arate tab.et unit which can be placed upo^ 3 SSSS ^ such T"- ^ 26 may be embodied in a se P" 

fortably. Alternatively, the dioitizina writino •..rfa^rK^,*!!,' aWMm *. the tab,et to be to more com- 

stroke input but also includes a writable disrilv ThT!^ ,m P lemented « an active screen that not only accepts pen 

An overview of the presem.y^? err ^ e m^ T" ^t™ ^ * ^ 50 th3t * may be viewed in «« 
20 and remote contro. 24 P^^^^^l?^^"^- F ^ * »es the contro. unit 

apparatus 64. As previously discussed the d!ph» ™v LTf?^ 3 ^ 62 f ° r C ° Uplm9 t0 3 video dis P |ay 
a flat panel display, a projection svstem or « SnS 2 ^ . f V telev,s,on set or television monitor, or it may be 
tion is provided by the telSion m ° n ' t0r - m ° St home entertainment systems the display fun C 
technology that can be coupled to the audio/video control 20. In Figure 4 this other equipment is shown diagrammati-
cally as other media 66. These media are preferably connected by conventional cabling 68 to the audio/video control 
20. The audio/video control thus operates as the audio/video signal switching and processing center for the system. For 
example, if the user has selected the VCR 46 as the source of program content, the audio and video signals from the
VCR are switched through audio/video control 20 and communicated through port 62 to display 64. In this regard, the
audio/video control 20 is preferably capable of handling multiple tasks concurrently. Thus the laser disc player 48 may 
be selected as the current source of program material for presentation on display 64, while VCR 46 is taping a television 
broadcast for later viewing. The audio/video control may include a television tuner to supply the necessary audio and 
video signals to the VCR. 

Whereas audio and video signal flow is routed between components using cabling 68, the control functions can be
provided via an alternate link such as an infrared link. In Figure 4 an infrared transponder 70 provides this function. The 
audio/video control 20 sends a command to transponder 70 and the transponder broadcasts that command to each of 
the components in the system. The infrared command includes a device header indicating which of the components 
should respond to the command. In one embodiment, the infrared link is bidirectional, allowing components such as the
VCR 46 or multimedia computer 52 to send infrared replies back to the audio/video control 20. However, the infrared
link may also be unidirectional, as with current remote controls. There are, of course, other ways of communicating con- 
trol signals between the various components and the audio/video control 20. Infrared has the advantage of being com- 
patible with existing home entertainment equipment. By using infrared control, the audio/video control 20 is able to 
control the operation of home entertainment components that were designed before the advent of the present technology.
Alternatively, the individual components may have infrared networking capabilities so that the remote control 24 can
communicate directly with the components without having to go through the audio/video control 20. Thus the video user 
environment of the invention can be incorporated into existing systems, working with most of the user's existing equip- 
ment. 
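
The device-header addressing described above can be sketched as follows. This is a minimal illustration only: the source does not specify the infrared command format, so the `IRCommand` fields, the device names and the `Transponder` class are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRCommand:
    device: str  # device header naming the intended recipient (hypothetical encoding)
    action: str  # command payload, e.g. "RECORD" (hypothetical)


class Component:
    """A piece of audio/video equipment listening on the infrared link."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def on_broadcast(self, cmd):
        # Every component sees the broadcast; only the addressed one acts on it.
        if cmd.device == self.name:
            self.received.append(cmd.action)
            return True
        return False


class Transponder:
    """Stands in for infrared transponder 70: rebroadcasts to all components."""

    def __init__(self, components):
        self.components = components

    def broadcast(self, cmd):
        # Returns the names of the components that responded to the command.
        return [c.name for c in self.components if c.on_broadcast(cmd)]


vcr, disc = Component("VCR"), Component("LASERDISC")
responders = Transponder([vcr, disc]).broadcast(IRCommand("VCR", "RECORD"))
```

In this sketch the laser disc player receives the broadcast but ignores it because the header names the VCR.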

The remote control 24 and control unit 20 preferably employ a form of distributed processing, in which each unit
includes a processor that works in concert with the other. In Figure 4 this distributed architecture is depicted diagrammatically
by processor 72, shown as being shared by or related to both the remote control 24 and the control unit 20.
Although distributed processing represents the preferred implementation, the video user environment could be imple- 
mented by a system in which all of the processing power is concentrated in one of the remote control or control unit 
devices alone. For example, the remote control 24 could be constructed with minimal processing power and configured
to simply relay all hand-drawn instructions of the user to the control unit 20 for interpretation. Such a configuration would
require a higher data transfer rate between the remote control 24 and control unit 20. An alternate embodiment places
processing power in the remote control 24, so that user-entered, hand-drawn instructions are interpreted in the remote 
control unit, with higher level instructional data being sent to the control unit 20 for further processing. 

Figure 5 shows the hardware architecture of the preferred implementation. The components of the remote control
unit 24 and the audio/video control unit 20 are shown in the dotted line boxes numbered 24 and 20, respectively. The
remote control unit includes a processor 72a having local random access memory or RAM 74 as well as read only 
memory or ROM 76. While these functions are shown separately on the block diagram, processor 72a, RAM 74, ROM 
76 and various other functions could be implemented on a single, highly integrated circuit using present fabrication 
technology. Coupled to the processor 72a is an infrared interface 78. The remote control unit 24 may optionally include
a push-button display 77 which provides visual feedback via various light functions and a push-button keypad 79 for providing
input to control unit 20. Push-button keypad 79 could have preprogrammed functions or may be programmed by
the user, including a learning function which would allow keypad 79 to take on universal functions. Remote control 24 
may also be provided with a microphone interface 81 for receiving spoken commands from the user. One skilled in the 
art will appreciate that processor 72a or 72b may implement well-known voice processing technology for interpreting
spoken commands into computer instructions. The remote control unit 24 also includes a digitizing writing surface comprising
tablet interface 80 and tablet 82. The tablet interface 80 decodes the user-entered, hand-drawn instructions,
converting them into positional or spatial data (x,y data). Processor 72a includes an internal clock such that each x,y 
data value is associated with a time value, producing a record of the position of the pen or stylus as it is drawn across 
tablet 82. This space/time data represents the hand-drawn instructions in terms of the "ink" data type. The ink data type
is a defined data type having both spatial and temporal components (x,y,t). The ink data type is described more fully
below. 
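
As a rough illustration of the (x,y,t) ink data type, the sketch below pairs each decoded tablet position with a clock reading. The names `InkPoint` and `stamp` and the 10 ms fake clock are assumptions made for the example; the source does not give a concrete layout for the data type.

```python
from typing import Callable, List, NamedTuple, Tuple


class InkPoint(NamedTuple):
    x: int    # spatial component from the tablet interface
    y: int
    t: float  # time value taken from the processor's clock


def stamp(samples: List[Tuple[int, int]], clock: Callable[[], float]) -> List[InkPoint]:
    """Attach a clock reading to each (x, y) sample, in arrival order."""
    return [InkPoint(x, y, clock()) for x, y in samples]


# A fake clock ticking every 10 ms stands in for the internal clock of processor 72a.
ticks = iter(i * 0.01 for i in range(100))
ink = stamp([(0, 0), (3, 4), (6, 8)], lambda: next(ticks))
```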

The audio/video control unit 20 also includes a processor 72b having associated RAM 86 and ROM 88. Processor 
72b is also provided with an infrared interface 90. Infrared interface 90 communicates unidirectionally or bidirectionally 
(depending on the embodiment) with infrared interface 78 of the remote control 24. In addition to the infrared interface,
processor 72b also includes video interface circuitry 92 that supplies the appropriate video signal to the video out port
62. 

Much of the video user environment is preferably implemented as software that is executed by the distributed proc- 
essor architecture 72 (e.g. 72a and 72b). The architecture of this software is depicted in Figure 6. The software can be 
downloaded into RAM 74 and RAM 86 over various transmission media, including but not limited to telephone lines,
fiber optic cable or the television cable that also delivers the video signals.

"esXar^ 

functions depicted generally at 106 d ,he hardW3re 104 7,16 soflwar * P"»Wes each of the 

hardware 104. The hardware abstraction l^1SS?C^ l SX!^ ,he t0 «*« 

television tuners, supporting video and araohics priantor hlrH re,ated J ssues such as implementing timers, tuning 
era.s. The harc^areTbstracW^ ^de ZTrT 9 ,UnCti ° nS and °P eratina ^ 

One level above the hardwar abstraction laverT th^m wl! n , e , cessar y dsv '<* driver for the tablet interface 80. 
rea. time operating system for the vSS^eS^S ZT*«™Z"r ^ ^ S6rves 35 the 

-oplayerand multiuser " ru^n^^ 
The preferred video user interface, generated by user interface laver n 4 i echrt - ^
-ocat^^^^^^ 

=aS^^^ 

ma, viewing operation the video piS^TS tr^STS^ 5^£^S!rf ^ 
wants to access the video user environment functinnafitu thl Comma " d bar 32 ,s n °« Present. When the user 

once anywhere on the digitizing ,ab^^^ *" C ? mmand bar 32 b * tapping the pen 

much o? the 6 ZXZ ^^e^ a "f ^ «— « ly However, 

a user might draw a short descriptive ^Z L^XlTe **** tUM ^ —P 1 * 

in Figure 8 through which the user may sign J, The o!^n^ " T 9 * ° nCe - ™ S brings U P a P anel sh °™ 
are displayed: a text string 1 22 arKZ2^lr*2S^T^ a T?* 7° 0 ° "*** tao typeS of information 
string and its associate ink region 4Tll!SS2^rfnL^ ^ 8aCh USer is svm bo^ed by the text 

text string JZ identifies the user who has J^^™?^-^ 8 ^^ * e ^ R 9 ure 8 * e 

strained: it can be a picture, a doodl a siSe a 3 writfenTn "„", * "* re9i0a The rS9i0n iS entirel V unc °"" 
between the ink region and the text string sue XauZZTn" * 8nd 50 ^ There is ex P licrt bind ing 

identifying a single individual. rtTJ^J^T^tT^ * SyStem and ,he user a * 

a tuple. This same paradigm carries throuoh a 1 2 ^ 9 f ° rmS 3 data Structure often referred to as 

Once the Sign'n pane, is on^ e^^ 
pletes the action, logging in the user as the S5^2 VmZIi*^" 9 °" *' TaPP ' n9 the " D ° lt! " button «™- ' 
searching feature of the invention discusLdbrow rSf' ^"f 6 '* the user ma * sea '<* for a specific ID using a 
thus the user does not need to sign ^J£^,^ nfl appr0ximate irlk etching technique, 

date normal handwriting variations. ^ eaCh * me - ^ SyStem is " exible en °"9h to accommo- 

The system is capable of oerforminn tL ^ . - b " ' S ° n ' y act,ve when an 10 is selected, 

tion. By tapping on the Search button Z a SS^T^ °" ' ^ ^ haM ™ a ^ 

enters a hand^rawn entry or annotation in fhe ink reoio^ m£ ^^fl," 35 ' m ^ 9 The user 

stored as user IDs The aooroximatP inkZ ?! 9 . d fh ' S entry ,s com Pa r ed with the ink data previously 

the user .1st 120 as idenWies the match and high^s it in 

next best match by typing the "Find" butonTS^cSn ^ SoS ^ ' S the USer may P roceed to the 

As an alternate searching techniqut the ^^^^J^^TT ^ dS8ired ,D '* found - 
^.sdonebytypingthedesiredtextstringu^aTo^^^ 
keyboard icon preferably appears as a standard QWERTY keyboard resembling a conventional keyboard found on a
personal computer. When the keyboard is used to enter a text string, the system finds an exact match in the list of IDs 
by searching for the character string entered by the user. Like the ink search, the text matching search can also be 
approximate. Thus if the user enters the query "ddl" the text string "dpi" would be considered a better match than the 
text string "jeff."
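
The ranking in the example above can be reproduced with a plain edit distance over characters. The source does not say which measure the text search uses, so Levenshtein distance is an assumption here, chosen because the ink search is described in terms of an edit distance technique. The function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_ids(query, ids):
    """Order the stored text IDs from best to worst match."""
    return sorted(ids, key=lambda s: edit_distance(query, s))


ranked = rank_ids("ddl", ["jeff", "dpi"])
```

Under this measure "dpi" is two edits away from "ddl" while "jeff" is four away, which agrees with the ordering stated in the text.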

After the user has signed in with the user list screen, a briefly displayed confirmatory screen is projected showing 
the text and ink data representing the ID through which the user has signed in. If desired, the time of day may also
be momentarily displayed. After the confirmatory screen has been displayed for a suitable length of time (e.g. five seconds)
it disappears, leaving only the current video screen visible. In the event the user chooses not to sign in, the system
assumes that the last entered user ID is applicable by default.

The video user environment of the invention provides a full complement of standard television controls such as vol- 
ume, balance, brightness, color and so forth. In addition, an on-screen keypad is available for changing channels by 
direct entry of the numeric channel number or by "surfing" up and down the dial by clicking suitable up and down but- 
tons. The standard television controls are presented by tapping the TV button 136 on command bar 32. 

The presently preferred implementation continues to use the traditional remote control push buttons for performing
standard television control functions such as those listed above. For continuity and maximum flexibility, these same
functions are duplicated on screen through the video user interface. 

Although the video user interface provides the same ability to control standard television control functions as the 
traditional remote control, the video user interface of the invention goes far beyond the traditional remote control. The
invention provides sophisticated tools to help the user manage his or her video programming. Figure 10 shows the television
control panel 138 that is displayed when the TV control button 136 is tapped. The numeric keypad 140 is used
to enter television channels directly and the up and down buttons 142 sequentially surf through the channels in forward 
and backward directions. Tapping on the channel list button 144 brings up a scrollable list of channels with handwritten
annotations as illustrated in Figure 11. As with the sign-in panel, it is possible for the user to select an item manually
or search for an item using the approximate ink or text matching techniques. In this case, the numeric pad 140
(accessed by tapping on the appropriate numeral icons) limits the user to numeric input (i.e. TV channels). Tapping on 
the "Schedule" button 146 displays a convenient television schedule illustrated in Figure 12. The preferred implemen- 
tation portrays the TV schedule in the form of a traditional paper-based television guide. It has the distinct advantage, 
however, of knowing what time it is. Thus, the TV schedule screen (Figure 12) highlights programs currently playing, to
assist the user in making a choice. The TV schedule of Figure 12 is thus an active schedule capable of highlighting
which programs are current, updating the display in real time. In Figure 12 the active programs are designated by dotted
lines at 148 to indicate highlighting. The present invention carries the concept of active scheduling one step further, 
however. Each program in the display is tagged with a predefined icon indicating its genre. Thus news, sports, drama, 
comedy, kids and miscellaneous may be designated. The user may limit the TV schedule to display only those programs
in certain genres by tapping the "Clear All" button 150 and by then activating one or more of the check boxes in
the category palette 152. In the example shown in Figure 13, the user has elected to limit the display to programs in the
sports, comedy and kids categories. This feature in the video user environment makes it much easier for the user to 
identify which programs he or she wants to watch. 
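
A minimal sketch of the active, genre-filtered schedule follows. The `Program` record, the sample titles and times, and the representation of checked boxes as a set are all hypothetical; only the genre names and the pairing of "World Series" with channel 2 come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    title: str
    channel: int
    start: int  # minutes since midnight (assumed representation)
    end: int
    genre: str  # one of: news, sports, drama, comedy, kids, miscellaneous


def visible(schedule, checked_genres):
    """Keep only programs whose genre check box is active."""
    return [p for p in schedule if p.genre in checked_genres]


def is_current(p, now):
    """True while the program is on the air, so the display can highlight it."""
    return p.start <= now < p.end


schedule = [
    Program("Evening News", 4, 18 * 60, 19 * 60, "news"),
    Program("World Series", 2, 19 * 60, 22 * 60, "sports"),
    Program("Cartoons", 7, 19 * 60, 20 * 60, "kids"),
]
# After "Clear All", the user checks sports, comedy and kids.
shown = visible(schedule, {"sports", "comedy", "kids"})
highlighted = [p.title for p in shown if is_current(p, now=20 * 60 + 15)]
```

Re-evaluating `is_current` against the system clock on each refresh is one simple way to keep the highlighting up to date in real time.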

Finally, the TV schedule allows the user to program the TV to change channels at specific times automatically. Thus
the user does not miss an important show. Unlike programming of current VCRs, which can be complicated and frustrating,
programming in the video user environment is handled in a highly intuitive way. The user simply taps on a show
displayed in the schedule (such as "World Series" in Figure 13), thereby highlighting it. Then, at the appropriate time,
the video user environment switches to the proper channel (in this case channel 2). As with all video user environment 
applications, ease of use is key. 
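
The tap-to-program behavior might be organized along the following lines. This is only a sketch under stated assumptions: the `Tuner` and `ChannelScheduler` classes, the minute-based clock and the polling `tick` method are hypothetical and are not taken from the source.

```python
class Tuner:
    def __init__(self, channel):
        self.channel = channel


class ChannelScheduler:
    """Remembers tapped shows and retunes when their start time arrives."""

    def __init__(self, tuner):
        self.tuner = tuner
        self.pending = {}  # start minute -> channel to switch to

    def tap(self, start, channel):
        # Tapping a show in the schedule queues a channel change for its start time.
        self.pending[start] = channel

    def tick(self, now):
        # Called periodically with the system clock's current minute.
        due = [s for s in self.pending if s <= now]
        for s in sorted(due):
            self.tuner.channel = self.pending.pop(s)


tuner = Tuner(channel=7)
sched = ChannelScheduler(tuner)
sched.tap(start=19 * 60, channel=2)  # e.g. "World Series" on channel 2
sched.tick(now=18 * 60 + 59)
before = tuner.channel
sched.tick(now=19 * 60)
after = tuner.channel
```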

The foregoing has described how the video user environment may be used to access and control television. Similar
capability is provided for other audio and video components such as the VCR. Figure 14 depicts the VCR control panel
154 that is displayed when the VCR button 156 is tapped. The VCR control panel provides traditional play, stop, pause, 
rewind and fast forward control. In addition, if the VCR equipment is capable of such functionality, the VCR tape can be 
indexed forward or backward on a frame-by-frame basis. Similar capabilities can be provided for controlling laser disc
players, for example.

As best illustrated in Figure 14, tapping the "Program" button 158 calls up a display visually identical to the TV
schedule display of Figure 12. However, the TV schedule and the VCR schedule are maintained as separate data struc- 
tures, so that the user may program the TV and VCR independently. Using the same visual displays for different but 
comparable functions is one way the presently preferred implementation makes the system easier to use. Reusing
the same icons and tools (including the same window layouts, locations and function of buttons) speeds the learning
process, as the user only needs to have experience with one instance of the tool to know how to apply it in its other 
settings. This also makes the video user environment application smaller, as code can be shared among several func- 
tions. 
Tapping the "Library" button 160 (Figure 14) brings up yet another browser displaying text and ink annotations
in pairs. Similar in appearance to the channel list of Figure 11, the video library displays

iiDrary ana select an archived event by tapping on it. This would in turn cause the video on demand ^te-m tn ™ m 

Tapping on the "Games" button 162 (Figure 14) brings up a window (Figure 15) that provides a quick and easy interface
for a user (even a child) to access a variety of on-line games. Some of these games may involve other players on
a network. The presently preferred embodiment of the video user environment does not directly implement

,s cont ?T ,ated that such games would be sup p ,ied * comme ™' -oiiS^SKr^SS 

sST 9 ' nterfaCe S,mP ' y diSP ' ayS 3 P,Ura ' ity ° f iCOnS 10 represent each ° f avai,ab,e games on ■ £ useS 

Tapping on the "Shopping" button 164 calls up a display of home shopping options (Figure 16). Preferably each
option is displayed as a separate icon that the user may tap on in order to access those shopping services. If desired,

^itTe^^^^ 

Tapping on the "I-mail" button 166 (ink-mail) provides the user with an electronic mail communication system. In
contrast to conventional E-mail systems that rely on keyboard-entered text, the video user environment allows the
user to send hand-drawn or handwritten messages. The I-mail interface (Figure 17) preferably provides a

caZ Tsrr ^ draw handwrrtten messa9es that ^ then be sLz:zztnT: r z£5££z£^ 

cat.on networkto a rec,p,ent. These handwritten messages allow for more personalized correspondence and ^^1" 
access.ble than typed electronic mail. Additionally, writing with a pen is moVe P o W e^ ^ ^^1^ ^ 

^h^f | d,scussed ^ above - tne ^deo user environment has access to a system clock whereby the TV schedule and VP R 

sks^ srs crxr ou,e ,4 ' ™' te — » - - ■ — * sssrs 

Preferred Ink Search and Retrieval Technology 
the actual embodiment, the extracted feature vectors are represented as numerical data that is stored in the computer.
As indicated at 206, each extracted feature vector is classified according to a predetermined code book 210. The presently
preferred embodiment stores 64 clusters of stroke types, each cluster being represented by its centroid or average
stroke of that type. As in the case of the extracted feature vectors (block 204) the feature vector clusters are stored as
numerical computer data. In Figure 18 the data comprising code book 210 are shown graphically (instead of numerically)
to simplify the presentation. In Figure 18 note that the horizontal line segment of block 206 most closely matches
the centroid 212 of the Type 2 stroke cluster 214. Thus in the output string (block 216) the VQ code 2 is used to represent
the horizontal line in block 206. In block 216 the leftmost numeral 2 corresponds to the leftmost horizontal line
stroke. The remaining codes represent the remaining ink strokes comprising the original incoming ink data. 
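
The code book lookup described above amounts to a nearest-centroid search. The sketch below assumes Euclidean distance and uses a toy code book of four two-dimensional centroids; the source specifies 64 clusters but not the distance measure or the feature dimensions, so those choices and the name `quantize` are illustrative only.

```python
import math


def quantize(feature, codebook):
    """Return the index (VQ code) of the centroid closest to the feature vector."""
    return min(range(len(codebook)), key=lambda k: math.dist(feature, codebook[k]))


# Toy code book: four centroids in a made-up 2-D feature space.
codebook = [(0.0, 1.0), (0.7, 0.7), (1.0, 0.0), (0.7, -0.7)]

# Three strokes, each already reduced to a feature vector.
strokes = [(0.98, 0.05), (0.1, 0.9), (0.65, -0.75)]
codes = [quantize(s, codebook) for s in strokes]
```

The resulting list of codes plays the role of the output string of block 216: one small integer per pen stroke.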

Through the above-described procedure the incoming ink data is converted, pen stroke by pen stroke, into a feature
vector that corresponds to each individual pen stroke. The set of feature vectors which collectively represent a series of
pen strokes is stored in the computer database as the user-drawn annotation. This is depicted at 218.

To further illustrate, a software block diagram of the presently preferred embodiment is shown in Figure 19. The 
annotation system operates on digitized pen stroke data that is ultimately represented as an "ink" data type. As will be
illustrated, it is not necessary to convert the ink data type into an ASCII character data type in order to perform the
search and retrieval procedures. Indeed, in the case of graphical (nontext) annotations, conversion to ASCII would have 
no meaning. Thus, a significant advantage is that the annotation system operates in a manner which allows the "ink" 
data to be language-independent. 

As illustrated in Figure 19, the user-drawn query 300 is captured as a string of (X,Y) ink points, corresponding to the
motion of the pen tip over the surface of the digitizing tablet or pad as the user draws query 300. The presently preferred
embodiment digitizes this information by sampling the output of the digitizing pad at a predetermined sampling rate. 
Although a fixed sampling rate is presently preferred, the invention can be implemented using a variable sampling rate, 
as well. By virtue of the digitized capture of the X,Y position data, both spatial and temporal components of the user-drawn
pen strokes are captured. The temporal component may be implicit information: the ordering of sampled points
relative to one another conveys temporal information. Alternatively, the temporal component may be explicit: the exact
time each point was sampled is captured from an external clock.

In the presently preferred embodiment, employing a fixed sampling rate, each X,Y data point is associated with a
different sampling time. Because the sampling rate is fixed, it is not necessary to store the sampling time in order to
store the temporal data associated with the pen stroke. Simply recording the X,Y position data as a sequence automatically
stores the temporal data, as each point in the sequence is known to occur at the next succeeding sampling time.

In the alternative, if a variable sampling rate system is implemented, (X,Y,T) data is captured and stored. These 
data are the (X, Y) ink points and the corresponding time T at which each ink point is captured. 
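
To make the implicit-time case concrete, the sketch below recovers a time for each point from nothing more than its position in the sequence and the fixed sampling rate. The function name `implicit_times` and the 100 Hz rate are assumptions made for the example.

```python
def implicit_times(points, rate_hz, t0=0.0):
    """Expand an (X, Y) sequence into (X, Y, T) using a fixed sampling rate.

    Point i was sampled i sampling periods after the first point, so the
    time never needs to be stored alongside the coordinates.
    """
    return [(x, y, t0 + i / rate_hz) for i, (x, y) in enumerate(points)]


xy = [(10, 20), (11, 22), (13, 25)]
xyt = implicit_times(xy, rate_hz=100)  # hypothetical 100 samples per second
```

With a variable sampling rate no such reconstruction is possible, which is why the (X,Y,T) triples must be stored explicitly in that case.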

The raw ink point data is stored in data store 302. Next, a segmentation process 304 is performed on the stored ink 
point data 302. The presently preferred segmentation process searches the ink point data 302 for Y-minima. That is, the
segmentation process 304 detects those local points at which the Y value coordinate is at a local minimum. In hand-drawing
the letter "V" as a single continuous stroke, the lowermost point of the letter "V" would represent a Y-minima
value.
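
A minimal sketch of Y-minima segmentation follows, assuming a coordinate system in which Y increases upward so that the bottom of a "V" has the smallest Y value (many tablets report Y increasing downward, in which case the comparison would be reversed). The function names and the treatment of flat runs are illustrative choices, not details from the source.

```python
def y_minima(points):
    """Indices where Y is strictly lower than the previous point and no higher than the next."""
    ys = [y for _, y in points]
    return [i for i in range(1, len(ys) - 1)
            if ys[i] < ys[i - 1] and ys[i] <= ys[i + 1]]


def segment(points):
    """Split the ink points at each local Y-minimum.

    The minimum point closes one segment and opens the next, so adjacent
    segments share that point.
    """
    bounds = [0] + y_minima(points) + [len(points) - 1]
    return [points[a:b + 1] for a, b in zip(bounds, bounds[1:])]


# The letter "V" drawn as one continuous stroke.
v = [(0, 4), (1, 2), (2, 0), (3, 2), (4, 4)]
segments = segment(v)
```

Here the list of cut indices corresponds to the segmentation pointers of the text, and the sliced lists correspond to the alternative of storing each segment in its own buffer.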

Segmentation is performed to break the raw ink point data into more manageable subsets. Segmentation is also 
important for minimizing the variation in the way the users produce ligatures, that is, the connections between characters or even
words. These segment subsets may be designated using suitable pointers to indicate the memory locations at which
the Y-minima occur. In this case, these segmentation pointers may be stored at 306 to be associated with the ink point 
data 302 previously captured. In the alternative, if desired, the segmented data may be separately stored in one or more 
memory buffers instead of using pointers. 

Once the raw data has been segmented, the individual segments or pen strokes are operated on by a set of extraction
functions 308. The presently preferred embodiment operates on the pen stroke (segment) data using 13 different
extraction functions. These extraction functions each extract a different feature of the pen stroke data; the features
are then used to construct a feature vector. Table I lists the presently preferred features that are extracted by the
extraction functions 308. For further background information on these extraction functions, see Rubine, Dean,
"Specifying Gestures by Example," Computer Graphics, Vol. 25, No. 4, July 1991. The feature vectors of a given stroke
are diagrammatically represented in Figure 19 at 310.

It will be appreciated that in most cases the user will not draw the same annotation in precisely the same way each
and every time. That is, the (X,Y,T) coordinates and temporal properties of a given annotation may vary somewhat
each time the user draws that annotation. The presently preferred system accommodates this variation first by the
manner in which the vector quantization is performed. Specifically, the vector quantization process 312 assigns each input
stroke to the predetermined vector 315 from the user-dependent stroke types 314 that represents the closest match.
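
The assignment of an input stroke to its closest stroke type can be sketched as a nearest-neighbor lookup. The sketch assumes Euclidean distance as the measure of "closest match" (the text does not name a metric) and uses a toy two-dimensional codebook with three entries rather than 13-feature vectors; `quantize` and `codebook` are hypothetical names.

```python
import math
from typing import List, Sequence

def quantize(feature: Sequence[float], codebook: List[Sequence[float]]) -> int:
    """Return the index (stroke type label) of the codebook vector that is
    closest to `feature`, using Euclidean distance (an assumption)."""
    if not codebook:
        raise ValueError("codebook must not be empty")
    return min(range(len(codebook)), key=lambda k: math.dist(feature, codebook[k]))

# Toy codebook standing in for the user-dependent stroke types.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]

# Feature vectors of three successive strokes of one annotation.
strokes = [(0.1, -0.2), (4.8, 5.1), (0.9, 1.2)]
stroke_type_string = [quantize(f, codebook) for f in strokes]
```

Two slightly different drawings of the same stroke map to the same label as long as both stay closer to the same codebook vector, which is how the quantization step absorbs small variations.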

After each of the strokes representing the query has been processed in this fashion, a comparison is made 
between those strokes and the user-drawn annotations that have been stored in association with the documents in the 
database 320. Thus, for example, the query "important" may be compared against the stored annotation "this is very 
important!" An edit distance analysis is performed to make this comparison. 

Shown as edit distance analysis process 318, the query stroke type string is compared with each of the stored 
annotation stroke type strings 321 of the database 320. The edit distance analysis compares each stroke type value in 
the query string with each stroke type value in each of the annotation strings. An edit distance computation is performed
by this comparison, yielding the "cost" of transforming (or editing) one string into the other. The individual string/string 
comparisons are then ranked according to cost, with the least cost resultants presented first. In this way, a sorted list 
comprising all or the n-best matches is displayed in the thumbnail sketches of the main browser screen. Alternatively, 
rather than showing a sorted list, the user may be shown the best match on the main browser screen. If the user deter- 
mines that this match is not correct, the user may tap the "Next" button (not shown) to see the next best match. 
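
The ranking step can be sketched as follows, assuming unit costs for insertion, deletion and substitution and a small in-memory mapping of document names to stroke type strings. The names `edit_distance`, `n_best`, `doc_a` and so on, and the numeric labels, are purely illustrative.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Unit-cost edit distance between two stroke type strings,
    keeping only two rows of the table."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # delete x
                           cur[j - 1] + 1,            # insert y
                           prev[j - 1] + (x != y)))   # match or substitute
        prev = cur
    return prev[-1]

def n_best(query: Sequence[int],
           annotations: Dict[str, Sequence[int]],
           n: int) -> List[Tuple[int, str]]:
    """Return the n least-cost (cost, document) pairs, cheapest first."""
    scored = ((edit_distance(query, s), name) for name, s in annotations.items())
    return heapq.nsmallest(n, scored)

# Hypothetical stored annotations, as strings of numeric stroke type labels.
annotations = {"doc_a": [3, 17, 42, 8], "doc_b": [3, 17, 8], "doc_c": [60, 61, 62]}
ranked = n_best([3, 17, 42, 8], annotations, n=2)
```

Calling `n_best` with `n=1` gives the single best match; the "Next" button behavior would correspond to stepping through the full sorted list one entry at a time.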

Figure 20 shows the basic edit distance technique. In this case, the stored annotation "compress" is compared with 
the query string "compass." It should be understood that Figure 20 depicts the comparison of two strings as a compar- 
ison of individual letters in two differently spelled words. This depiction is intended primarily to aid in understanding the 
edit distance computation technique and not necessarily as a depiction of what two stroke type strings might actually 
look like. In this regard, each of the 64 different stroke types may be arbitrarily assigned different numerical labels. Thus
the edit distance computation would compare the respective numeric labels of the stored annotation and the input 
query directly with each other. There is no need to convert the individual strings into ASCII characters and Figure 20 is 
not intended to imply that such conversion is necessary. 

Referring to Figure 20, each time the annotation string stroke value matches the query string stroke value a cost of
zero is assigned. Thus in Figure 20, a zero cost is entered for the comparison of the first four string values "comp." To
accommodate the possibility that a string/string comparison may involve insertion, deletion or substitution of values, a
cost is assigned each time an insertion, deletion or substitution must be made during the comparison sequence. In the
example of Figure 20, the query string "compass" requires insertion of an additional value "r" after the value "p." A cost
of one is assigned (as indicated at the entry designated 422). Continuing with the comparison, a substitution occurs
between the value "e" of the stored annotation string and the value "a" of the query string. This results in an additional
cost assignment of one being added to the previous cost assignment, resulting in a total cost of two, represented in
Figure 20 at 424. Aside from these insertion and substitution operations, the remainder of the comparisons match, value
for value. Thus, the final "cost" in comparing the annotation string with the query string is two, represented in Figure 20
at 426.

In the preceding discussion, a first minimum cost path was described in which "compass" is edited into "compress"
by inserting an "r" and substituting an "e" for an "a." An alternative edit would be to substitute an "r" for an "a" and
insert an "e." Both of these paths have the same cost, namely two.
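
The table-filling computation described for Figure 20 can be sketched as below. The layout (one row per query value, one column per stored annotation value) and the unit costs are assumptions made for illustration; letters are used only because the figure uses them, and the same code works on sequences of numeric stroke type labels.

```python
from typing import List, Sequence

def edit_table(stored: Sequence, query: Sequence) -> List[List[int]]:
    """Fill the full edit distance table.

    table[i][j] is the least cost of editing the first i query values
    into the first j stored annotation values.
    """
    rows, cols = len(query) + 1, len(stored) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i          # i deletions
    for j in range(cols):
        table[0][j] = j          # j insertions
    for i in range(1, rows):
        for j in range(1, cols):
            sub = 0 if query[i - 1] == stored[j - 1] else 1
            table[i][j] = min(table[i - 1][j] + 1,        # deletion
                              table[i][j - 1] + 1,        # insertion
                              table[i - 1][j - 1] + sub)  # match / substitution
    return table

table = edit_table("compress", "compass")
prefix_cost = table[4][4]    # "comp" against "comp"
total_cost = table[-1][-1]   # "compass" against "compress"
```

The diagonal entry for the shared prefix "comp" stays at zero and the final entry is two, in agreement with the costs described for Figure 20. Both minimum cost paths mentioned above end at that same final entry.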

Figure 21 gives another example of the edit distance computation technique. As before, strings of alphabetic char- 
acters are compared for demonstration purposes. As previously noted, this is done for convenience, to simplify the illus- 
tration, and should not be interpreted as implying that the strings must be first converted to alphanumeric text before 
the comparisons are made. Rather, the procedures illustrated in Figures 20 and 21 are performed on the respective
stroke data (vector quantized symbols) of the respective stored annotation and input query strings. 

Figure 21 specifically illustrates the technique that may be used to perform an approximate match (word spotting). 
In Figure 21 the stored annotation "This is compression," is compared with the query string "compass." Note how the 
matched region 430 is extracted from the full string of the stored annotation by scanning the last row of the table to find 
the indices that represent the lowest value. Note that the first (initializing) row in Figure 21 is all 0s - this allows the 
approximate matching procedure to start anywhere along the database string. 
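
A minimal sketch of this approximate matching (word spotting) variant follows. It assumes unit costs and, instead of backtracking through the table, carries the start column along with each cost so that the matched region can be read off directly; the function name `spot` and the tie-breaking rule (earliest end, then earliest start) are choices made for this sketch only.

```python
from typing import Sequence, Tuple

def spot(stored: Sequence, query: Sequence) -> Tuple[int, int, int]:
    """Find the region stored[start:end] that best matches `query`.

    The initializing row is all zeros, so a match may start anywhere
    along the stored string; the best end position is the lowest value
    in the last row. Each cell holds (cost, start column).
    """
    cols = len(stored) + 1
    prev = [(0, j) for j in range(cols)]          # all-zero first row
    for i, q in enumerate(query, 1):
        cur = [(i, 0)]
        for j in range(1, cols):
            sub = 0 if q == stored[j - 1] else 1
            cur.append(min(
                (prev[j - 1][0] + sub, prev[j - 1][1]),  # match / substitution
                (prev[j][0] + 1, prev[j][1]),            # skip a query value
                (cur[j - 1][0] + 1, cur[j - 1][1]),      # skip a stored value
            ))
        prev = cur
    end = min(range(cols), key=lambda j: prev[j][0])     # scan the last row
    cost, start = prev[end]
    return cost, start, end

stored = "This is compression,"
cost, start, end = spot(stored, "compass")
region = stored[start:end]
```

For this example the lowest value in the last row is two, and the matched region lies inside the word "compression" rather than at the beginning of the stored string.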

The presently preferred edit distance procedure is enhanced over the conventional procedures described in the lit- 
erature. In addition to the three basic editing operations (delete a character, insert a character, and substitute one char- 
acter for another), it is useful to add two new operations when comparing pen stroke sequences. These new operations 
are "split" (substitute two strokes for one stroke) and "merge" (substitute one stroke for two strokes). These additional 
operations allow for errors made in stroke segmentation and generally lead to more accurate results.
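
One way to add the two multi-stroke operations to the basic recurrence is sketched below. It assumes that any two adjacent values may be exchanged for any single value at a cost of one (`multi_cost`); a practical system might restrict which stroke types can be split or merged, and nothing here is taken from the actual implementation.

```python
from typing import Sequence

def enhanced_edit_distance(a: Sequence, b: Sequence, multi_cost: int = 1) -> int:
    """Edit distance with insert, delete, substitute, plus two-for-one and
    one-for-two substitutions (the "split" and "merge" operations)."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(d[i - 1][j] + 1,                           # deletion
                       d[i][j - 1] + 1,                           # insertion
                       d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
            if j >= 2:
                # one value of a exchanged for two values of b
                best = min(best, d[i - 1][j - 2] + multi_cost)
            if i >= 2:
                # two values of a exchanged for one value of b
                best = min(best, d[i - 2][j - 1] + multi_cost)
            d[i][j] = best
    return d[n][m]

enhanced_cost = enhanced_edit_distance("compass", "compress")
```

With these operations, exchanging "a" for "re" counts as a single edit, so the cost of editing "compass" into "compress" drops from two to one.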

The use of our enhanced edit distance procedure is illustrated in Figure 21. In Figure 21 the split operation is used
to substitute the letters "re" in "compress" for the letter "a" in "compass." Note that the backtracking arrow in Figure 21 
spans one row but two columns, thereby signifying the multicharacter (merge) substitution. Hence the edit distance is 
one, not two, in this case. By way of comparison, Figure 20 illustrates the basic edit distance algorithm without utilizing 
the two new multicharacter operations. Thus the cost (as depicted in Figure 20) of editing "compass" into "compress"
is two.

The above-described procedure works well in most user-drawn annotation applications. The combined use of [...]

[...] incorporated into existing home entertainment or computer equipment. By tapping on the [...]

Claims 

1. An audio/video system having an enhanced video user environment, characterized by:

an audio/video control apparatus (20) for selectively performing predetermined audio/video control functions 

[...] remote control unit (24) [...]

[...]

6. The system of any of claims 1 - 5, characterized in that said digitizing writing surface (26) is responsive to a [...]

7. The system of any of claims 1 - 6, characterized in that said digitizing writing surface (26) is responsive to the
fingertip.

8. The system of any of claims 1 - 7, characterized in that said audio/video control apparatus (20) includes at least
one control port for coupling to at least one component (36-52, 66) of audio/video equipment and wherein said
audio/video control apparatus (20) includes a control module for issuing (68; 70) control signals through said
control port to said component (36-52; 66) of audio/video equipment.

9. The system of claim 8, characterized in that said component (36-52; 66) of audio/video equipment is a component
selected from the group consisting of television (36; 22), video cassette recorder (VCR) (46), audio tape recorder
(44), audio disc player (48), video disc player (48), audio amplifier (42), surround sound processor (38, 40), video
signal processor, camcorder (50), video telephone, cable television signal selector, satellite antenna controller,
computer (52), CD-ROM player, photo CD player, video game player and information network access device.

10. The system of any of claims 1 - 9, characterized in that said processor (72b) is disposed in said audio/video control
apparatus (20).

11. The system of any of claims 1 - 9, characterized in that said processor is attached to said audio/video control
apparatus (20).

12. The system of any of claims 1 - 9, characterized in that said processor (72a) is disposed in said remote control
apparatus (24). 

13. The system of any of claims 1 - 9, characterized in that said processor (72) comprises a multiprocessor system
(72a, 72b) having a first portion (72b) disposed in said audio/video control apparatus (20) and having a second
portion (72a) disposed in said remote control (24).

14. The system of any of claims 1 - 13, characterized in that said audio/video control apparatus (20) includes an
integrated television tuner for tuning a user selected channel carrying program information and providing a video signal
representing said program information to said video display apparatus (64; 36; 22).

15. The system of any of claims 1 - 14, characterized in that said video display apparatus (64; 36; 22) is a television 
(36; 22) and wherein said audio/video control apparatus (20) outputs a video signal through said port, preferably 
an NTSC or PAL or HDTV signal. 

16. The system of any of claims 1-15, characterized in that said audio/video control apparatus (20) is incorporated
into a component of audio/video equipment. 

17. The system of claim 16, characterized in that said component of audio/video equipment is a component selected
from the group consisting of television (36; 22), video cassette recorder (VCR) (46), audio tape recorder (44), audio
disc player (48), video disc player (48), audio amplifier (42), surround sound processor (38, 40), video signal
processor, camcorder (50), videotelephone, cable television signal selector, satellite antenna controller, computer (52),
CD-ROM player, photo CD player, video game player and information network access device.

18. The system of any of claims 1-17, characterized in that said processor (72) includes a speech recognizer module. 

19. The system of any of claims 1-18, characterized in that said processor (72) generates at least one menu (32, 34)
of user selectable system control options (136, 156, 160-168) and said audio/video control apparatus (20) issues
a signal through said port (62) to display said menu (32, 34) on said video display apparatus (64; 36; 22) coupled
to said port (62), wherein, in case said processor (72) is a multiprocessor system (72a, 72b), at least one processor
of said multiprocessor system (72a, 72b) generates said at least one menu (32, 34).

20. The system of any of claims 1-19, characterized in that said processor (72) is coupled to memory means (74, 76,
86, 88) for storing user input.

21. The system of claim 20, characterized in that said user input comprises handwritten annotations drawn on said
digitizing writing surface (26).

22. The system of claim 21, characterized by an on-demand video interface (158, 148) whereby said handwritten
annotations [...]

[...] recall a prerecorded entertainment program for presentation on said video display apparatus [...]

24. The system of any of claims 1 - 23, characterized in that

said multiprocessor system (72a, 72b) communicating between said audio/video control apparatus (20) and [...]